Manifold-Ranking Based Topic-Focused Multi-Document Summarization

Authors

  • Xiaojun Wan
  • Jianwu Yang
  • Jianguo Xiao
Abstract

Topic-focused multi-document summarization aims to produce a summary biased toward a given topic or user profile. This paper presents a novel extractive approach to this task based on manifold-ranking of sentences. The manifold-ranking process naturally makes full use of both the relationships among all the sentences in the documents and the relationships between the given topic and the sentences. It assigns each sentence a ranking score that denotes the sentence's biased information richness. A greedy algorithm then imposes a diversity penalty on each sentence. The summary is produced by choosing the sentences with both high biased information richness and high information novelty. Experiments on DUC2003 and DUC2005 show, under ROUGE evaluation, that the proposed approach can significantly outperform both the top-performing systems in the DUC tasks and baseline approaches.
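
The two stages described in the abstract (manifold-ranking, then a greedy diversity penalty) can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything concrete here is an assumption: the update rule f <- alpha * S f + (1 - alpha) * y with S = D^(-1/2) W D^(-1/2) is the commonly used manifold-ranking iteration, and the names `manifold_rank` and `greedy_select`, the parameters `alpha` and `omega`, and the toy affinity matrix are all hypothetical. It is not the authors' implementation.

```python
import numpy as np


def manifold_rank(W, y, alpha=0.6, iters=200, tol=1e-8):
    """Iterate f <- alpha * S @ f + (1 - alpha) * y (assumed update rule).

    W: symmetric, non-negative affinity matrix over the topic node and sentences.
    y: prior vector, 1.0 for the topic node and 0.0 for every sentence.
    Returns the score vector f and the normalized matrix S.
    """
    W = np.array(W, dtype=float)
    np.fill_diagonal(W, 0.0)              # drop self-loops
    d = W.sum(axis=1)
    d[d == 0.0] = 1.0                     # guard against isolated nodes
    inv_sqrt = 1.0 / np.sqrt(d)
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]   # D^-1/2 W D^-1/2
    y = np.array(y, dtype=float)
    f = y.copy()
    for _ in range(iters):
        f_next = alpha * S @ f + (1.0 - alpha) * y
        converged = np.abs(f_next - f).max() < tol
        f = f_next
        if converged:
            break
    return f, S


def greedy_select(scores, S, k, omega=8.0):
    """Pick k items greedily, penalizing items similar to those already picked.

    The penalty form (omega * similarity * score of the picked item) is an
    assumption chosen for illustration.
    """
    base = np.array(scores, dtype=float)  # original ranking scores
    current = base.copy()                 # scores after penalties
    remaining = set(range(len(base)))
    chosen = []
    while remaining and len(chosen) < k:
        best = max(remaining, key=lambda i: current[i])
        chosen.append(best)
        remaining.remove(best)
        for j in remaining:
            current[j] -= omega * S[j, best] * base[best]
    return chosen


# Toy example: node 0 is the topic, nodes 1..3 are sentences (made-up affinities).
W = [[0.0, 0.9, 0.8, 0.1],
     [0.9, 0.0, 0.95, 0.1],
     [0.8, 0.95, 0.0, 0.1],
     [0.1, 0.1, 0.1, 0.0]]
y = [1.0, 0.0, 0.0, 0.0]
f, S = manifold_rank(W, y)
summary = greedy_select(f[1:], S[1:, 1:], k=2)   # indices into the sentence list
print(summary)
```

Seeding only the topic node in `y` is what biases the scores toward the topic, while the penalty step lowers the score of sentences that overlap with ones already selected, which is how the sketch trades information richness against novelty.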


Similar resources

Graph-Based Multi-Modality Learning for Topic-Focused Multi-Document Summarization

Graph-based manifold-ranking methods have been successfully applied to topic-focused multi-document summarization. This paper further proposes to use the multi-modality manifold-ranking algorithm for extracting a topic-focused summary from multiple documents by considering the within-document sentence relationships and the cross-document sentence relationships as two separate modalities (graphs)....


Using Syntactic and Shallow Semantic Kernels to Improve Multi-Modality Manifold-Ranking for Topic-Focused Multi-Document Summarization

Multi-modality manifold-ranking has recently been used successfully in topic-focused multi-document summarization. This approach is based on the Bag-Of-Words (BOW) assumption, where the pair-wise similarity values between sentences are computed using the standard cosine similarity measure (TF*IDF). However, the major limitation of the TF*IDF approach is that it only retains the frequency of the words and ...


Query-focused Multi-Document Summarization: Combining a Topic Model with Graph-based Semi-supervised Learning

Graph-based learning algorithms have been shown to be an effective approach for query-focused multi-document summarization (MDS). In this paper, we extend the standard graph ranking algorithm by proposing a two-layer (i.e. sentence layer and topic layer) graph-based semi-supervised learning approach based on topic modeling techniques. Experimental results on TAC datasets show that by considerin...


TMSP: Topic Guided Manifold Ranking with Sink Points for Guided Summarization

Guided summarization is an extension of query-focused multidocument summarization. We proposed a novel ranking algorithm, Topic Guided Manifold Ranking with Sink Points (TMSP) for guided summarization tasks of TAC2010. TMSP is a topic extended version of Manifold Ranking with Sink Points (MRSP), which handles the Update Summarization tasks of TAC2009 well. We adopt the TMSP and MRSP methods to ...


Iterative Feedback Based Manifold-Ranking for Update Summary

Update summary, as defined for the DUC2007 new task, aims to capture evolving information of a single topic over time. It delivers focused information to a user who has already read a set of older documents covering the same topic. This paper presents a novel manifold-ranking framework based on an iterative feedback mechanism for this summary task. The topic set is extended by using the summarization of ...




Publication date: 2007